Dynamic Natural Language Processing with Recurrence Quantification Analysis
Abstract. Writing and reading are dynamic processes. As an author composes a text, a sequence of words is produced. The author hopes that this sequence causes others to revisit certain thoughts and ideas. These processes of composition and revisitation by readers are ordered in time, which means that text itself can be investigated through the lens of dynamical systems. A common technique for analyzing the behavior of dynamical systems, known as recurrence quantification analysis (RQA), can be used as a method for analyzing the sequential structure of text. RQA treats text as a sequential measurement, much like a time series, and can thus be seen as a kind of dynamic natural language processing (NLP). The extension has several benefits. First, because it is part of a suite of time series analysis tools, many measures can be extracted in one common framework. Second, the measures have a close relationship with some commonly used measures from natural language processing. Finally, using recurrence analysis offers an opportunity to expand the analysis of text by developing theoretical descriptions derived from complex dynamic systems. We showcase an example analysis on 8,000 texts from the Gutenberg Project, compare it to well-known NLP approaches, and describe an R package (crqanlp) that can be used in conjunction with the R library crqa.
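To make the idea of treating text as a sequential measurement concrete, here is a minimal sketch in plain Python. It is not the crqanlp or crqa API, and the function names `recurrence_points` and `rqa_measures` are hypothetical. It assumes the simplest case, where two positions in a text recur when they hold exactly the same word, and computes two standard RQA measures: the recurrence rate (the share of position pairs that recur) and determinism (the share of recurrent points that lie on diagonal lines, i.e. repeated word sequences).

```python
def recurrence_points(tokens):
    """Return the set of (i, j) pairs with i < j where the same token recurs."""
    return {
        (i, j)
        for i in range(len(tokens))
        for j in range(i + 1, len(tokens))
        if tokens[i] == tokens[j]
    }


def rqa_measures(tokens, min_line=2):
    """Return (recurrence rate, determinism) for a token sequence.

    Assumption: exact word matching, main diagonal excluded, and only the
    upper triangle of the (symmetric) recurrence plot is counted.
    """
    n = len(tokens)
    points = recurrence_points(tokens)
    if n < 2 or not points:
        return 0.0, 0.0

    # Recurrence rate: recurrent pairs divided by all possible pairs.
    rr = len(points) / (n * (n - 1) / 2)

    # Determinism: fraction of recurrent points that sit on a diagonal
    # line of at least min_line consecutive points.
    on_lines = 0
    for (i, j) in points:
        if (i - 1, j - 1) in points:
            continue  # this point continues a line counted from its start
        length = 1
        while (i + length, j + length) in points:
            length += 1
        if length >= min_line:
            on_lines += length
    det = on_lines / len(points)
    return rr, det


tokens = "the cat sat on the mat the cat sat".split()
rr, det = rqa_measures(tokens)
print(f"recurrence rate = {rr:.3f}, determinism = {det:.3f}")
```

A higher determinism value indicates that repeated words tend to come back in the same order, which is why diagonal-line measures relate to n-gram style repetition. The `min_line` threshold of 2 is an illustrative choice, not a value taken from the paper.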
Similar resources
Dynamic characterization and predictability analysis of wind speed and wind power time series in Spain wind farm
The renewable energy resources such as wind power have recently attracted more researchers’ attention. It is mainly due to the aggressive energy consumption, high pollution and cost of fossil fuels. In this era, the future fluctuations of these time series should be predicted to increase the reliability of the power network. In this paper, the dynamic characteristics and short-term predictabili...
Combinatorics & Synchronization in Natural Semiotics
In this study, the derivation of an objective metric to assess the degree of structuring of written and spoken texts is presented. The proposed metric is based on the scoring of recurrences inside a text by means of Recurrence Quantification Analysis (RQA), a nonlinear technique widely used in other fields of science. The adopted approach allowed us to create a ranking...
Prosody and synchronization in cognitive neuroscience
We introduce our methodological study with a short review of the main literature on embodied language, including some recent studies in neuroscience. We investigated this component of natural language using Recurrence Quantification Analysis (RQA). RQA is a relatively new statistical methodology, particularly effective in complex systems. RQA provided a reliable quantitative description of recu...
Dynamic Mediation for Removing Language Comprehension Problems: A Psychological Support for Listening Comprehension Mental Processing
Dynamic Assessment is an approach to assessment within Applied Linguistics that stems from Vygotsky’s Socio-Cultural Theory of mind, his concept of the Zone of Proximal Development, and Feuerstein's theory of Structural Cognitive Modifiability. This study attempts to pinpoint the sources of mental processing problems in listening comprehension and applies dynamic interventions to remove t...
Orthographic Structuring of Human Speech and Texts: Linguistic Application of Recurrence Quantification Analysis
A methodology based upon recurrence quantification analysis is proposed for the study of orthographic structure of written texts. Five different orthographic data sets (20th century Italian poems, 20th century American poems, contemporary Swedish poems with their corresponding Italian translations, Italian speech samples, and American speech samples) were subjected to recurrence quantification ...